
False lock-order-inversion reports for locks taken in a single thread #488

Open
ramosian-glider opened this issue Sep 1, 2015 · 25 comments

Comments

@ramosian-glider
Member

Originally reported on Google Code with ID 81

Consider two locks, L1 and L2, and a thread T1 that does the following:
 lock(L1)
 lock(L2)
 unlock(L1)
 lock(L1)
 unlock(L1)
 unlock(L2)

Meanwhile, other threads may (or may not) acquire and release L1 and L2 (but never
both) at the same time.
Based on the lock acquisition pattern in T1, ThreadSanitizer will report a lock-order
inversion (potential deadlock), although there is no possibility of a deadlock as long
as other threads do not take both L1 and L2.
The error can even be reported in a single-threaded process, which is not correct at all.
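
For illustration, a minimal C++ sketch of this pattern (using std::mutex; the function
name is made up) that triggers the report even though only one thread ever holds both
locks:

    #include <mutex>

    std::mutex L1, L2;

    // Runs entirely on one thread (T1). TSan records both the L1->L2 and the
    // L2->L1 edge and reports a potential lock-order inversion anyway.
    void t1_body() {
      L1.lock();
      L2.lock();     // edge L1 -> L2 (L2 acquired while L1 is held)
      L1.unlock();
      L1.lock();     // edge L2 -> L1 (L1 acquired while L2 is held)
      L1.unlock();
      L2.unlock();
    }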

Reported by glider@chromium.org on 2014-10-09 09:25:52

@ramosian-glider
Member Author

Example: https://code.google.com/p/chromium/issues/detail?id=412097

Reported by glider@google.com on 2014-10-09 09:27:28

@ramosian-glider
Member Author

Another one: https://code.google.com/p/v8/source/browse/trunk/test/cctest/test-lockers.cc?r=24476#483

Run() acquires the first lock at line 485 and the second lock at line 499, creating
the L1->L2 edge in the lock acquisition graph.
Then the first lock is unlocked at line 510 and locked again at line 517 (when the
unlocker dies), creating the L2->L1 edge.

Reported by glider@google.com on 2014-10-09 09:35:08

@ramosian-glider
Member Author

We can probably filter such reports out if we prove that each edge has been created
by the same thread. This may require keeping, for each pair (L_i, L_j), the list of
threads that acquired L_i while holding L_j.
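
A rough sketch of that bookkeeping (not tsan's actual data structures; all names are
made up):

    #include <map>
    #include <set>
    #include <utility>
    #include <vector>

    using MutexId  = unsigned;
    using ThreadId = unsigned;
    using Edge     = std::pair<MutexId, MutexId>;   // (mutex held, mutex acquired)

    // For each edge "L_i acquired while L_j is held", remember the threads that
    // created it.
    std::map<Edge, std::set<ThreadId>> edge_threads;

    void on_acquire(ThreadId t, MutexId held, MutexId acquired) {
      edge_threads[{held, acquired}].insert(t);
    }

    // Filter the report only if every edge in the cycle was created by one and
    // the same thread (and by no other thread).
    bool single_thread_cycle(const std::vector<Edge>& cycle) {
      const std::set<ThreadId>* common = nullptr;
      for (const Edge& e : cycle) {
        auto it = edge_threads.find(e);
        if (it == edge_threads.end() || it->second.size() != 1) return false;
        if (common && *common != it->second) return false;
        common = &it->second;
      }
      return true;
    }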

Reported by glider@google.com on 2014-10-09 09:37:06

@ramosian-glider
Member Author

Reported by dvyukov@google.com on 2014-10-09 09:52:07

  • Status changed: Accepted

@ramosian-glider
Member Author

>> We can probably filter such reports out if we prove that each edge has been created
by the same thread.

We shouldn't, IMHO...
What if the locking happens in a worker thread and there is a single worker just
because the test is small (but the large app will have more workers)?


Are you sure the pattern in #0 is good coding practice?
I would rather annotate that code (btw, do we have annotations?) than try to fix the
detector. 

Reported by konstantin.s.serebryany on 2014-10-09 18:37:59

@ramosian-glider
Member Author

> What if the locking happens in a worker thread and there is a single worker just
because the test is small (but the large app will have more workers)?

The same can happen with a data race: tsan won't report racy accesses that happen
to land in the same thread, and asan won't report a racy use-after-free if the use
happens before the free.
Our general strategy to date has been to sacrifice some potential true positives in
order to avoid false positives, and to require tests to expose the bugs -- for any
concurrency-related testing (races, deadlocks, etc.) that means the test must be
multi-threaded.
I don't understand why we need to deviate from this line in this case.
Some of these cases involve pre-built libraries, so annotations are not possible. And
suppressions have a high chance of suppressing true deadlocks.
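
For reference, a tsan suppression for this kind of report matches frames in the
reported stacks; a sketch, with a made-up frame name:

    # suppressions.txt, passed to the run via TSAN_OPTIONS=suppressions=suppressions.txt
    # Hides every lock-order-inversion report whose stack mentions this frame --
    # including any genuine inversion that goes through it, which is the risk above.
    deadlock:WorkerPool::Schedule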

Reported by dvyukov@google.com on 2014-10-09 20:04:26

@ramosian-glider
Member Author

I previously worked on a Java deadlock detector (JCarder) and provided this as an option.
The default was to report all cycles, but there was an option to only include multi-threaded
cycles in the output. In using jcarder on some large Java projects, I found it helpful
to have both options available.

Reported by todd@cloudera.com on 2015-01-08 04:05:21

@ramosian-glider
Member Author

Hi Todd,

Thanks for sharing your experience!

Can you provide an example of where it is useful to report "single-threaded deadlocks"?
My concern is that developers frequently apply the tool to large projects with lots
of external dependencies; in such a context the tool ideally works without any tuning
(potentially sacrificing some percentage of true positives for the absence of false
positives), or uses some form of per-function/per-file annotations if absolutely
necessary. I do not really see how a global setting can be useful in such contexts.

Reported by dvyukov@google.com on 2015-01-19 08:35:21

@ramosian-glider
Member Author

I agree that in general, it's best to optimize to avoid false positives.

The reason that "single-threaded deadlock" reports are useful is just what the earlier
commenter mentioned. We run the tool on our unit test suite, and often unit tests
don't generate enough multi-threaded concurrency to cause thread pools to exercise
multiple threads in places like RPC handler pools.

Another case I've found is that, even though today the code might be safe, it's still
the case that different methods are taking a pair of locks in inverted order. If the
code changes a bit in the future (or users call library methods differently than the
unit tests) then you are likely to hit a deadlock. It's generally a best practice to
have a strict specified order of locks, and deviation from it is a code smell which
many teams would want to eliminate early on.
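
To illustrate (a small C++ sketch; all names are made up): with a documented rule of
"L1 before L2", the second function below is the kind of code smell meant here --
harmless while it never runs concurrently with the first, but a real deadlock the
moment it does:

    #include <mutex>

    std::mutex L1, L2;   // documented rule: always take L1 before L2

    void follows_rule() {
      std::lock_guard<std::mutex> a(L1);
      std::lock_guard<std::mutex> b(L2);
      // ... work under both locks ...
    }

    void violates_rule() {               // takes the same pair in the opposite order
      std::lock_guard<std::mutex> b(L2);
      std::lock_guard<std::mutex> a(L1);
      // ... work under both locks ...
    }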

So, I think it's best to allow a global setting, and also allow suppressions. Then,
in many projects it's feasible to enable the global setting and either fix or suppress
the few false positives that come out. If a project uses lots and lots of dependencies
that exhibit this sort of problem, they can disable the reporting of single-threaded
deadlocks.

Reported by tlipcon on 2015-01-19 08:43:18

@ramosian-glider
Member Author

OK, makes sense, thanks!

Reported by dvyukov@google.com on 2015-01-19 08:50:21

@ramosian-glider
Member Author

Adding Project:ThreadSanitizer as part of GitHub migration.

Reported by glider@google.com on 2015-07-30 09:21:32

  • Labels added: Project:ThreadSanitizer

@dvyukov
Contributor

dvyukov commented Sep 17, 2015

Just got another offline complaint about this.
Tsan reports a false lock-order inversion within a single thread in V8. The second mutex is taken only because there is a multi-threaded component internally protected with a mutex, but this component is used only on a single thread.

@kcc
Contributor

kcc commented Sep 17, 2015

On Thu, Sep 17, 2015 at 1:35 AM, Dmitry Vyukov wrote:

> Just got another offline complaint about this.
> Tsan reports a false lock-order inversion within a single thread in V8.

Why is it "false"?
Benign -- most likely. But not false.

Won't suppressions work for this case?

> The second mutex is taken only because there is a multi-threaded component
> internally protected with a mutex, but this component is used only on a
> single thread.



@dvyukov
Contributor

dvyukov commented Sep 17, 2015

A single thread either deadlocks or it does not. There is no "it did not deadlock now, but potentially can". Tsan does this extrapolation, and it is false.

@kcc
Contributor

kcc commented Sep 17, 2015

On Thu, Sep 17, 2015 at 9:22 AM, Dmitry Vyukov wrote:

> A single thread either deadlocks or it does not. There is no "it did not
> deadlock now, but potentially can". Tsan does this extrapolation, and it is
> false.

What if we have a thread pool that in a small test always schedules two
tasks on the same thread, but in a large system will not?

I don't mind a flag that will ignore such reports.
By default, these reports should be printed with some explanation why and a
mention of the flag.

(I may not have time to implement it any time soon :(



@dvyukov
Contributor

dvyukov commented Sep 17, 2015

Should we do the same for races? That is, if a thread accesses a variable twice, we could print a bug report saying: what if these accesses happen in different threads in the future? That does not make sense to me. False positives kill trust and experience.

@kcc
Contributor

kcc commented Sep 17, 2015

On Thu, Sep 17, 2015 at 9:46 AM, Dmitry Vyukov wrote:

> Should we do the same for races? That is, if a thread accesses a variable
> twice, we could print a bug report saying: what if these accesses happen in
> different threads in the future? That does not make sense to me. False
> positives kill trust and experience.

Well, ok, if you feel that strongly -- do it.



@dvyukov
Contributor

dvyukov commented Sep 17, 2015

That's the most popular complaint about the deadlock detector.
Nobody complains about race reports in that way. So we can do the same for deadlocks:
while it is all in a single thread -- silence; if the cycle involves multiple
threads -- report.

More precise semantics: each subsequent edge in the mutex cycle must be on a
different thread than the previous one.

@kcc
Contributor

kcc commented Sep 18, 2015

On Thu, Sep 17, 2015 at 9:55 AM, Dmitry Vyukov wrote:

> That's the most popular complaint about the deadlock detector. Nobody
> complains about race reports in that way. So we can do the same for
> deadlocks: while it is all in a single thread -- silence; if the cycle
> involves multiple threads -- report.
>
> More precise semantics: each subsequent edge in the mutex cycle must be on a
> different thread than the previous one.

Go for it!
If you want me to do this -- I can't promise anything before mid Oct.



@morehouse morehouse assigned dvyukov and unassigned google Jun 6, 2018
@morehouse
Contributor

@dvyukov: Any progress on this?

@dvyukov
Contributor

dvyukov commented Jun 7, 2018

No progress. Still relevant.

@jprotze

jprotze commented Feb 10, 2020

This issue is also triggered by file I/O in libgfortran.

@choller

choller commented Jul 16, 2020

We have been seeing this on mozilla-central more than once now. I really think the deadlock detector should check whether only a single thread is involved and not report in that case.

@SteffenKoehlerVector

@dvyukov Is there any progress on this one?

We have such a finding in our code base and were very confused as to why a different lock order inside the same thread would be a potential deadlock. So I would also classify this as a "false positive".

If you argue that you want this finding in order to detect such scenarios in test suites that don't actually run into the issue, I would say your test suite has gaps -> you can't complain about a test suite not detecting issues if you did not test the functionality (e.g., does my thread pool work if I have more than just a single worker?).

In the end, the sanitizer is supposed to analyze the specific binary at hand. If that binary suffers from an issue, it should ideally find and report it. If the binary does not suffer from the issue, nothing should be reported.

@dvyukov
Contributor

dvyukov commented Sep 20, 2022

no progress done/planned
